Pitch-synchronous time-scaling for prosodic and voice quality transformations

Authors

  • João P. Cabral
  • Luís C. Oliveira
Abstract

Current time-domain pitch modification techniques have well-known limitations for large variations of the original fundamental frequency. This paper proposes a technique for changing the pitch and duration of a speech signal based on time-scaling the linear prediction (LP) residual. The resulting speech signal achieves better quality than the traditional LP-PSOLA method for large fundamental frequency modifications. By using nonuniform time-scaling, this technique can also change the shape of the LP residual for each pitch period. In this way, the most relevant glottal source parameters can be simulated, such as the open quotient, the spectral tilt and the asymmetry coefficient. Careful adjustment of these source parameters allows the original speech signal to be transformed so that it is perceived as if it were uttered with a different voice quality or emotion.
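
As a rough illustration of the time-scaling idea described in the abstract, the following Python sketch resamples a single pitch period of an LP residual to a new length, either uniformly or through a nonuniform warp. It is a sketch under stated assumptions, not the authors' implementation: the names `time_scale_period` and `piecewise_warp`, the use of linear interpolation, and the piecewise-linear warp shape are all hypothetical choices made here, and LP analysis, pitch marking and resynthesis are left out.

```python
import numpy as np


def time_scale_period(residual_period, new_length, warp=None):
    """Resample one pitch period of an LP residual to new_length samples.

    warp is an optional monotonic function mapping normalized output time
    in [0, 1] to normalized input time in [0, 1]. None means uniform
    time-scaling. Linear interpolation is an assumption of this sketch.
    """
    period = np.asarray(residual_period, dtype=float)
    n_in = len(period)
    t_out = np.linspace(0.0, 1.0, new_length)
    t_in = t_out if warp is None else warp(t_out)
    # Map normalized input time to fractional sample positions and interpolate.
    return np.interp(t_in * (n_in - 1), np.arange(n_in), period)


def piecewise_warp(breakpoint_in, breakpoint_out):
    """Hypothetical piecewise-linear warp with one interior breakpoint.

    The input segment [0, breakpoint_in] is mapped onto the output segment
    [0, breakpoint_out]; the remainder of the period fills the rest.
    Both breakpoints are assumed to lie strictly between 0 and 1.
    """
    def warp(t):
        return np.interp(t, [0.0, breakpoint_out, 1.0], [0.0, breakpoint_in, 1.0])
    return warp


# Toy "residual" period: a ramp, so the resampled values are easy to inspect.
period = np.arange(5.0)
uniform = time_scale_period(period, 9)
warped = time_scale_period(period, 9, piecewise_warp(0.5, 0.25))
```

In the uniform call the whole period is stretched evenly, which changes its duration and hence the local pitch. In the warped call the first half of the input period is squeezed into the first quarter of the output, which is one conceivable way to alter the relative duration of parts of the period (for example, the portion associated with the open phase).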

Similar resources

Epoch-Synchronous Overlap-Add (ESOLA) for Time- and Pitch-Scale Modification of Speech Signals

Time- and pitch-scale modifications of speech signals find important applications in speech synthesis, playback systems, voice conversion, learning/hearing aids, etc. There is a requirement for computationally efficient and real-time implementable algorithms. In this paper, we propose a high quality and computationally efficient time- and pitch-scaling methodology based on the glottal closure inst...

A hybrid approach to synthesize high quality Cantonese speech

Synthesizing high quality speech necessitates an intelligent modification algorithm to adjust the important prosodic features of the pre-stored speech units to meet the desired output requirements, such as smoothness, naturalness and pleasantness. The time domain pitch-synchronous overlap and add (TD-PSOLA) scheme is a simple but effective method of varying the pitch and time-scaling of speech ...

Adjusting the Frame: Biphasic Performative Control of Speech Rhythm

Performative time and pitch scaling is a new research paradigm for prosodic analysis by synthesis. In this paper, a system for real-time time and pitch scaling of recorded speech by means of hand or foot gestures is designed and evaluated. Pitch is controlled with the preferred hand, using a stylus on a graphic tablet. Time is controlled using rhythmic frames, or constriction gestures, define...

Flexible harmonic/stochastic speech synthesis

In this paper, our flexible harmonic/stochastic waveform generator for a speech synthesis system is presented. The speech is modeled as the superposition of two components: a harmonic component and a stochastic or aperiodic component. The purpose of this representation is to provide a framework with maximum flexibility for all kinds of speech transformations. In contrast to other similar systems...

Prosodic and segmental factors in foreign-accent conversion

We propose a signal processing method that transforms foreign-accented speech to resemble its native-accented counterpart. The problem is closely related to voice conversion, except that our method seeks to preserve the organic properties of the foreign speaker’s voice; i.e., only those features which cue foreign-accentedness are to be transformed. Our method operates at two levels: prosodic an...

Publication date: 2005